- Path: kuhub.cc.ukans.edu!anh
- From: anh@kuhub.cc.ukans.edu
- Newsgroups: comp.lang.c
- Subject: unique id for a string
- Message-ID: <1996Feb16.175601.114182@kuhub.cc.ukans.edu>
- Date: 16 Feb 96 17:56:01 CST
- Organization: University of Kansas Academic Computing Services
-
-
- Hello,
-
- This is not quite a C problem, but since the implementation will be in
- C anyway, I am asking here.
-
- Given an alphabet, for example, E = {A-Z}, I need to assign a unique id
- to each possible string (finite, approx. 32 chars long, or whatever, but
- it is finite) formed from the letters of E. I would like the id to be as
- small as possible. Is there a good formula?
-
- Say the alphabet consists of 26 letters: A=1, B=2, ..., Z=26. One formula is:
-
- HELLO = H * 26^4 + E * 26^3 + L * 26^2 + L * 26^1 + O * 26^0
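-
- A minimal sketch of that formula in C, assuming uppercase ASCII input and
- the mapping A=1, ..., Z=26. The name word_id is hypothetical, and the
- sum is evaluated with Horner's rule rather than explicit powers. It
- returns 0 for an empty string, an invalid character, or a value that
- would not fit in 64 bits.

```c
#include <stdint.h>

/* Hypothetical helper: positional id with A=1 ... Z=26.
   Assumes uppercase ASCII. Returns 0 for an empty string, for any
   character outside A-Z, or when the id would overflow 64 bits. */
uint64_t word_id(const char *s)
{
    uint64_t id = 0;

    for (; *s != '\0'; s++) {
        if (*s < 'A' || *s > 'Z')
            return 0;                           /* not in the alphabet */

        uint64_t digit = (uint64_t)(*s - 'A') + 1;

        /* id * 26 + digit must not exceed UINT64_MAX */
        if (id > (UINT64_MAX - digit) / 26)
            return 0;                           /* would overflow */

        id = id * 26 + digit;                   /* Horner step */
    }
    return id;
}
```

- With this mapping, word_id("HELLO") comes out as 3752127.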
-
- But the id is way too big. I need something that fits within a 32-bit int
- or a 64-bit long.
-
- Another way is to simply assign an id to each word and store the words in a
- binary search tree, so that it is relatively fast to make sure
- each word has a unique id. But I think it is still too slow. We are
- talking about a possible 200,000+ words.
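-
- A minimal sketch of the tree approach, assuming ids are simply handed out
- in order of first appearance (1, 2, 3, ...). The names struct node and
- intern are hypothetical, and the tree is not balanced, so already-sorted
- input would degrade it to a linked list.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type: one word and the id it got on first insertion. */
struct node {
    char *word;
    unsigned long id;
    struct node *left, *right;
};

/* Returns the id of `word`, inserting it with the next free id if it is
   new. `*root` and `*next_id` are caller-owned state (start them at NULL
   and 0). Returns 0 if memory allocation fails. */
unsigned long intern(struct node **root, unsigned long *next_id,
                     const char *word)
{
    /* Walk down the tree until the word or an empty slot is found. */
    while (*root != NULL) {
        int cmp = strcmp(word, (*root)->word);
        if (cmp == 0)
            return (*root)->id;                 /* already known */
        root = (cmp < 0) ? &(*root)->left : &(*root)->right;
    }

    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;

    n->word = malloc(strlen(word) + 1);
    if (n->word == NULL) {
        free(n);
        return 0;
    }
    strcpy(n->word, word);

    n->id = ++*next_id;     /* ids start at 1; 0 is reserved for failure */
    n->left = n->right = NULL;
    *root = n;              /* link the new node into the empty slot */
    return n->id;
}
```

- Ids assigned this way never exceed the number of distinct words, so
- 200,000 words fit easily in a 32-bit int, and a balanced tree of that
- size needs roughly 18 string comparisons per lookup.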
-
- I would appreciate hints/pointers/advice/solutions. Thank you.
-
- Anh
-